A METHOD AND AN APPARATUS FOR
VISUAL SUMMARIZATION OF DOCUMENTS

Field of the Invention 

The invention relates generally to displaying information on a graphic
user interface ("GUI") and more specifically to displaying information in such a
way as to quickly and easily communicate information to a user.

Description of Related Art

Computers and other electronic devices with GUIs are used to
communicate information. A part of this communication process involves
displaying information on a GUI in an efficient manner. In many retrieval and
browsing user interfaces, documents are represented by scaled-down images.
For example, if the document contains multiple pages, each page may be
represented by a separate icon. If each page of a document is represented by an
icon, many icons are needed to display a large document. This approach is
generally too cumbersome to use. In an alternative approach, a single icon may
be used to represent the entire document. Generally, the first page of the
document is arbitrarily chosen to represent the document regardless of whether
the visual appearance of the first page provides a visual cue for association with
that particular document. It is therefore desirable to have a system to represent
documents or other items such that information about a document or item is
easily relayed to and understandable by a user.

SUMMARY OF THE INVENTION

A computer system is disclosed that comprises a display, a processor
coupled to the display, and a memory coupled to the processor. Stored in the
memory is a routine, which when executed by the processor, causes the
processor to generate display data. The routine includes extracting at least one
visual feature from a document having a plurality of pages, ranking the pages in
the document according to the at least one visual feature, selecting a page for
representing a document according to a rank, and displaying the selected page as
the display data. Additional features, embodiments, and benefits will be evident
in view of the figures and detailed description presented herein.

BRIEF DESCRIPTION OF THE DRAWINGS

The features, aspects, and advantages of the invention will become more
thoroughly apparent from the following detailed description, appended claims,
and accompanying drawings in which:

Figure 1A illustrates thumbnails of all the pages in a document;

Figure 1B illustrates the first three pages of the document;

Figure 2A illustrates a first row of icons corresponding to the three most
visually significant pages of a document;

Figure 2B illustrates pages that have distinctly different visual
appearances;

Figure 3 illustrates a more compact representation of all the pages in a
document; and

Figure 4 illustrates one embodiment of a computer system.

DETAILED DESCRIPTION OF THE INVENTION

A method and apparatus for generating and displaying a visual
summarization of a document is described. In one embodiment, a technique
described herein extracts visual features from the document and ranks multiple
pages of a document based upon one or more visual features of each page.
The pages may be presented to a user on a graphical user interface (GUI), with
the pages that rank higher being displayed.

Some portions of the detailed descriptions which follow are presented in
terms of algorithms and symbolic representations of operations on data bits
within a computer memory. These algorithmic descriptions and representations
are the means used by those skilled in the data processing arts to most effectively
convey the substance of their work to others skilled in the art. An algorithm is
here, and generally, conceived to be a self-consistent sequence of steps leading to
a desired result. The steps are those requiring physical manipulations of
physical quantities. Usually, though not necessarily, these quantities take the
form of electrical or magnetic signals capable of being stored, transferred,
combined, compared, and otherwise manipulated. It has proven convenient at
times, principally for reasons of common usage, to refer to these signals as bits,
values, elements, symbols, characters, terms, numbers, or the like.

It should be borne in mind, however, that all of these and similar terms
are to be associated with the appropriate physical quantities and are merely
convenient labels applied to these quantities. Unless specifically stated
otherwise as apparent from the following discussion, it is appreciated that
throughout the description, discussions utilizing terms such as "processing" or
"computing" or "calculating" or "determining" or "displaying" or the like, refer to
the action and processes of a computer system, or similar electronic computing
device, that manipulates and transforms data represented as physical (electronic)
quantities within the computer system's registers and memories into other data
similarly represented as physical quantities within the computer system
memories or registers or other such information storage, transmission or display
devices.

The present invention also relates to apparatus for performing the
operations herein. This apparatus may be specially constructed for the required
purposes, or it may comprise a general purpose computer selectively activated or
reconfigured by a computer program stored in the computer. Such a computer
program may be stored in a computer readable storage medium, such as, but
not limited to, any type of disk including floppy disks, optical disks, CD-ROMs,
and magnetic-optical disks, read-only memories (ROMs), random access
memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type
of media suitable for storing electronic instructions, and each coupled to a
computer system bus.

The algorithms and displays presented herein are not inherently related to
any particular computer or other apparatus. Various general purpose systems
may be used with programs in accordance with the teachings herein, or it may
prove convenient to construct more specialized apparatus to perform the
required method steps. The required structure for a variety of these systems will
appear from the description below. In addition, the present invention is not
described with reference to any particular programming language. It will be
appreciated that a variety of programming languages may be used to implement
the teachings of the invention as described herein.

A machine-readable medium includes any mechanism for storing or
transmitting information in a form readable by a machine (e.g., a computer). For
example, a machine-readable medium includes read only memory ("ROM");
random access memory ("RAM"); magnetic disk storage media; optical storage
media; flash memory devices; electrical, optical, acoustical or other form of
propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.); etc.

Overview

Techniques described herein provide a scheme to rank page icons (e.g.,
thumbnails) according to their visual saliency. The rankings may be used to
select certain pages, preferably those with more salient features, for display. This
solution may make documents easier to recall than displaying only the first few
pages of a document as reduced-size images to provide a visual clue as to the
contents of the document. Additionally, techniques
described herein also provide for various effective representations of document
content in applications with limited display size.

Figure 1A illustrates thumbnails of all the pages in a document. Figure 1B
illustrates that showing only the first three pages, arbitrarily chosen, does not
help a user recall the document. By showing pages with the most salient features,
as shown in Figure 2A, or pages with distinctly different visual appearances, as
shown in Figure 2B, a user is provided, generally, with more information to
recall a particular document. Utilizing visual saliency and distinctive features, a
more compact representation of all pages in a document may be obtained as
illustrated in Figure 3.

The representations in Figures 2A, 2B, and 3, and other suitable 
10 representations, are possible using a combination of components such as, for 
example, features that describe visual characteristics of a document image, 
feature extraction and representation scheme, and a measure of visual saliency. 
Each of these components is described below.

A set of features capable of describing the visual characteristics of a
document image includes textural and layout feature information. Textural
features may include one or more of position, size, ink density, line spacing,
color and contrast. Layout features may include one or more of configuration of
blocks (e.g., column, header, etc.) or types of blocks (e.g., picture, line art, text,
etc.). Features that are known to play a significant role in human perception and
memory may also be used, such as, for example, surrounding space, letter height,
bold, bullets, indentation, all capitalization, italics, underlining and other
suitable features.

The feature extraction/representation scheme component involves the
use of document analysis systems that are capable of segmenting blocks,
detecting font sizes within blocks, and extracting other relevant information,
such as, for example, the textural and layout features described above. Although
visual information is naturally conveyed by a description language, in one
embodiment a vector representation is used instead to facilitate applications of
various techniques developed for information retrieval.

The measure of visual saliency may be based upon a variety of factors 
such as, for example, psychological experiments that provide some guidelines for 
designing this component. For instance, it has been determined that pictures 
tend to draw more attention than text blocks and character size is more 
significant than character style. The presence of attractive features contributes to 
the total visual saliency of the page. Optionally, this visual saliency component 
can also be normalized using schemes similar to term weighting for text retrieval 
to account for features common to all documents in a database.

Utilizing these components, visual features are first extracted for all pages
in a database using methods known in the art. Pages in a document are then
ranked according to their visual significance and uniqueness. The user or system
designer may determine which visual features are significant or unique. Since
the number of different visual features may be quite large, the number of visual
features chosen by a user or a system designer may also be quite large. The
ranking serves as the basis for the selection of representing icons in Figure 2A.

In addition to ranking pages, visual features may also be used to provide a
distance measure between documents. If the visual features are represented in
vector form, as is typically done in image-based retrieval techniques,
conventional information retrieval techniques as developed for a vector space
model may be applied to produce effective iconic document representations. For
example, clustering of the pages may reveal distinct page types as shown in
Figure 2B. While clustering of images is commonly performed as a navigation
aid to find similar documents, here clustering is used within a single document
having multiple pages. This is analogous to finding "keyframes" in a video.

Treating pages in a document as frames in a sequence may also lead to 
compact representations. "Scene changes" can be detected by comparing the 
visual distance between two consecutive pages to a threshold, by looking for 
transitions to different page types subsequent to clustering as described above, 
or by other variations such as, for example, combining visual saliency scores. 
When the distance between consecutive pages is very small, only one of the two
needs to be selected. Sequences of visually similar or uninteresting pages may be
stacked to reduce the space required, as illustrated in Figure 3. It will be appreciated
that these components may be utilized independently or in combination with 
one another to create other novel usages. 
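
The threshold test on consecutive pages can be sketched as follows. This is a minimal illustration in Python rather than the implementation of any embodiment: the function names, the placeholder distance (a plain sum of absolute differences), and the threshold value are all assumptions introduced for the example.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def select_scene_changes(
    pages: List[Vector],
    distance: Callable[[Vector, Vector], float],
    threshold: float,
) -> List[int]:
    """Return indices of pages that start a new visual "scene".

    The first page is always kept. A later page is kept only when its
    distance to the page immediately before it exceeds the threshold;
    otherwise it would be stacked behind the page that opened the scene.
    """
    if not pages:
        return []
    kept = [0]
    for i in range(1, len(pages)):
        if distance(pages[i - 1], pages[i]) > threshold:
            kept.append(i)
    return kept


def l1_distance(a: Vector, b: Vector) -> float:
    """Placeholder distance assumed for this demo only."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Five hypothetical pages, each described by a two-element feature vector.
pages = [[0.0, 0.0], [0.0, 0.1], [0.9, 0.8], [0.9, 0.8], [0.1, 0.0]]
scene_starts = select_scene_changes(pages, l1_distance, threshold=0.5)
```

With these sample vectors, pages 1 and 3 are nearly identical to their predecessors, so only pages 0, 2 and 4 would be shown in full.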

An Exemplary Algorithm

The visual summarization system described herein uses a source
document as input. In the first phase of the process, a number of, for example,
color bitmaps are generated. Each bitmap represents a separate page of the
source document. Visual features are then extracted from these bitmaps using
document analysis techniques. Two functions, Saliency and VisualDist, defined
over these features enable the effects shown in Figures 2A, 2B, and 3.

The techniques described herein may operate on a variety of document
types by using a feature extraction process that is assumed to utilize common
commercial optical character recognition (OCR) systems and that operates on the
most common denominator for document representation: image bitmaps. The
bitmap generation process is described for several common document formats,
such as, for example, paper documents, PostScript, portable document format
(PDF), hypertext markup language (HTML) and Word documents. Although it
is also possible to develop feature extraction modules designed specifically for
each document type, using a common representation simplifies the algorithm
description. Generalization to other document media may also be similarly
derived.

Bitmap Generation

Bitmap generation can be used for any type of computer-generated
source document. However, on occasion it may be more efficient or convenient
to use a specific method based on a particular type of source document. The
following description provides a general method and several type-specific
methods.

On an operating system ("OS") such as Microsoft Windows, a printer
driver is a software application that translates rendering commands from some
controlling application into a printable representation of a document. A user
typically has installed one printer driver for each different type of printer to
which access is granted.

Given a source document S generated by application A, the general
methodology operates as follows. The user runs application A, loads document
S, and selects the "print" function. The user then selects a printer driver that,
instead of sending its output to a printer, creates a number of color bitmap
images. The document is paginated just as if it were to be printed. The user
optionally has control of font sizes and target paper size, depending on the
application A.

Techniques for creating such a printer driver are known in the art since it
does not differ significantly from any other printer driver. It is assumed that a
bitmap corresponds to a page intended for a printer according to a default
dots-per-inch factor. Therefore, an 8.5x11" page corresponds to, for example, a
612x792 bitmap with a 72dpi factor.
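
The page-to-bitmap arithmetic above can be checked with a short sketch. The function name `bitmap_size` and the rounding to whole pixels are assumptions made for illustration.

```python
from typing import Tuple


def bitmap_size(width_in: float, height_in: float, dpi: int = 72) -> Tuple[int, int]:
    """Pixel dimensions of a page rendered at the given dots-per-inch factor."""
    return (round(width_in * dpi), round(height_in * dpi))


# A letter-size page (8.5 x 11 inches) at the default 72 dpi factor.
letter = bitmap_size(8.5, 11, dpi=72)
```

At 72 dpi this gives the 612x792 bitmap mentioned above; a higher factor scales both dimensions proportionally.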

In an alternative embodiment, the user selects an existing printer driver
that generates PostScript™ output (such drivers are commonly available as part
of an OS or through suppliers such as Adobe Inc.), and selects the "Print to File"
option. In this way, a PostScript file can be generated from an arbitrary source
document. This PostScript file, in turn, can be transformed into a number of
bitmap images.

Tools for using HTML to create bitmap images are known in the art. Such
tools are available from Sun (e.g., HotJava™), Microsoft (e.g., the Internet
Explorer ActiveX™ control), and AOL (e.g., the Netscape Mozilla™ project). Such a
tool can further use Dynamic HTML, XML, Style Sheets and further markup
languages.

In using HTML, there are two choices that determine the size of the final
output: target page width and font size. One page width to select is the screen
resolution width of an average user, for instance 800 pixels. An alternative is to
assume the width of a standard letter-size page, 8.5 inches. Similarly, font size
can be chosen to match the default setting on a standard Web browser, e.g., 12
point Times Roman font for variable-width characters.

Tools for rendering PDF files are known in the art. Since PDF includes 
information about page size, orientation, and font size, no further information is 
required. 

Tools for rendering PostScript™ files are known in the art. Since
PostScript™ includes information about page size, orientation, and font size, no
further information is required.

In addition to the methods above that relate to computer-generated
documents, any paper document can also be used as input. A scanning device,
which is known in the art, can turn the paper document directly into a color
bitmap per page.

Feature Extraction 

After image bitmaps are obtained for individual document pages,
conventional document analysis techniques may be applied to extract visual
features. Commercial OCR systems such as Xerox ScanWorX commonly provide
basic layout information and character interpretations. A single document page
is often decomposed into blocks of text, pictures, or figures. For text blocks,
word bounding boxes and font size are estimated. Since most commercial
systems operate on binary or gray scale images, color images can be converted to
a monochrome version first for block analysis. Color constituents can be
subsequently extracted by superimposing the color image with segmented block
information.

The end result of document analysis is a set of feature descriptions for
each document page. More specifically, for each page, a list of segmented blocks
is obtained. Each segmented block is categorized as text, a picture, or line art.
The location and color composition of each block are also known. In order to
proceed to use the algorithm described above, a suitable representation should
be chosen. Therefore, it is assumed that a simple statistical representation is
used, although other representations, even symbolic, are also possible.

A document image is divided into an $m \times n$ grid. For each uniquely
numbered square in the grid, $g_i$, $1 \le i \le m \cdot n$, five features are
recorded. The first three features, $t_i$, $p_i$, and $f_i$, indicate the portions
of the grid area which overlap with a text, picture or line art block,
respectively. For example, if the entire area under the grid belongs to a text
block, $t_i = 1$ and $p_i = f_i = 0$. If the left one third of the area overlaps a
text block, the right one third overlaps a picture, and the middle one third
contains white background, then $t_i = p_i = 0.33$ and $f_i = 0$. The next two
features, $b_i$ and $c_i$, contain the color information of the grid content.
Colors may be represented by their brightness, hue, and saturation attributes.
The brightness attribute represents the observed luminance and is
monochromatic. The hue attribute indicates the degree of "redness" or
"greenness". The saturation attribute reflects the pureness of the hue. Although
human perception is more sensitive to certain color tones than others, it is
assumed that visual significance is independent of the hue in the simplified
representation and only the "average brightness" and "average color pureness"
are recorded. Feature $b_i$ measures the average "blackness" inside a grid. More
precisely, it is the average brightness value for pixels in the grid, reversed and
normalized such that if all pixels inside a grid are pure black, $b_i = 1$. This
feature is equivalent to the "ink density" feature frequently used in conventional
document analysis of bitonal images. Feature $c_i$ is the average saturation
value for pixels in the grid, also normalized between 0 and 1. Therefore, a
grayscale image has only a brightness value but no saturation attribute. In
contrast, a grid containing color pixels will have a non-zero $c_i$ value.
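
The two color features can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the described implementation: pixels are taken to be 8-bit RGB triples, and the HSV "value" and "saturation" channels from the standard `colorsys` module stand in for the brightness and saturation attributes. The function name `color_features` is hypothetical.

```python
import colorsys
from typing import Iterable, Tuple

RGB = Tuple[int, int, int]  # 8-bit channels, 0..255 (assumed pixel format)


def color_features(pixels: Iterable[RGB]) -> Tuple[float, float]:
    """Return (blackness, saturation) for one grid cell, each in [0, 1].

    blackness  = 1 - average HSV value, so an all-black cell scores 1.
    saturation = average HSV saturation, so a gray cell scores 0.
    """
    total_value = 0.0
    total_saturation = 0.0
    count = 0
    for red, green, blue in pixels:
        _hue, sat, val = colorsys.rgb_to_hsv(red / 255.0, green / 255.0, blue / 255.0)
        total_value += val
        total_saturation += sat
        count += 1
    if count == 0:
        return (0.0, 0.0)  # empty cell: assumed to carry no ink and no color
    return (1.0 - total_value / count, total_saturation / count)


black_cell = color_features([(0, 0, 0)] * 4)
red_cell = color_features([(255, 0, 0)])
gray_cell = color_features([(128, 128, 128)] * 2)
```

Here the all-black cell yields full blackness with zero saturation, the pure red cell yields zero blackness with full saturation, and the gray cell has a saturation of zero, consistent with the grayscale remark above.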

Consequently, the visual information in a given page is represented by a
vector $\vec{v}$ with dimension $5 \cdot m \cdot n$, which can be considered as a
concatenation of 5 vectors $\vec{t}, \vec{p}, \vec{f}, \vec{b}, \vec{c}$, each of
$m \cdot n$ dimensions. A document consisting of $k$ pages will be represented by
$k$ vectors $\vec{v}_1 \ldots \vec{v}_k$. Elements in these vectors all have
values between 0 and 1. However, they do not have to sum to 1.
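
As a rough sketch of how the layout features and the page vector might be assembled, the Python below computes the fraction of a grid cell covered by each block type and concatenates five per-cell lists into one vector. The rectangle convention, the block-kind labels, and all function names are assumptions made for this example; blocks of the same kind are assumed not to overlap one another.

```python
from typing import Dict, List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # (left, top, right, bottom), assumed


def overlap_fraction(cell: Rect, block: Rect) -> float:
    """Fraction of the cell's area that is covered by the block."""
    left = max(cell[0], block[0])
    top = max(cell[1], block[1])
    right = min(cell[2], block[2])
    bottom = min(cell[3], block[3])
    if right <= left or bottom <= top:
        return 0.0  # no intersection
    cell_area = (cell[2] - cell[0]) * (cell[3] - cell[1])
    return (right - left) * (bottom - top) / cell_area


def cell_layout_features(cell: Rect, blocks: List[Tuple[str, Rect]]) -> Dict[str, float]:
    """Accumulate text, picture and line-art coverage for one grid cell."""
    feats = {"text": 0.0, "picture": 0.0, "line_art": 0.0}
    for kind, rect in blocks:
        feats[kind] += overlap_fraction(cell, rect)
    return feats


def page_vector(t: Sequence[float], p: Sequence[float], f: Sequence[float],
                b: Sequence[float], c: Sequence[float]) -> List[float]:
    """Concatenate five per-cell lists (each of length m*n) into one vector."""
    return list(t) + list(p) + list(f) + list(b) + list(c)


# The example from the text: left third text, right third picture, middle blank.
cell = (0.0, 0.0, 30.0, 30.0)
blocks = [("text", (0.0, 0.0, 10.0, 30.0)), ("picture", (20.0, 0.0, 30.0, 30.0))]
feats = cell_layout_features(cell, blocks)

# A 2x2 grid gives five lists of four values, hence a 20-element vector.
v = page_vector([1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0],
                [0.2, 0.5, 0, 0], [0, 0.5, 0, 0])
```

For the sample cell, the text and picture fractions each come out at about one third and the line-art fraction at zero.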

Visual Saliency Evaluation

The simplest form of visual saliency is evaluated on a per-page basis
independent of other pages in the same document or database. This is achieved
by assigning a weight to each visual feature. For example, since colors are more
noticeable than grays, and pictures are more visually significant than line art
and text, a reasonable weighting for the 5 features is $w_t = 0.1$, $w_f = 0.4$,
$w_p = 1$, $w_b = 0.8$, $w_c = 2$. The saliency score for a page is then computed as

$$\mathrm{Saliency}(\vec{v}) = w_t \sum_i t_i + w_f \sum_i f_i + w_p \sum_i p_i + w_b \sum_i b_i + w_c \sum_i c_i$$

Although, in this example, the weights are applied uniformly across the
page, the weights may be made to reflect the positional variance in human
perception. For instance, different weights may be assigned to $w_c(i)$ depending
on the location of $i$ to emphasize the significance of colors when occurring in
the middle of a page versus on the top or bottom of a page. Therefore, a more
general equation for saliency is

$$\mathrm{Saliency}(\vec{v}) = \sum_i w_t(i) \cdot t_i + \sum_i w_f(i) \cdot f_i + \sum_i w_p(i) \cdot p_i + \sum_i w_b(i) \cdot b_i + \sum_i w_c(i) \cdot c_i$$

Using the function Saliency, pages in a document can thus be ranked
according to visual distinctiveness, and selected to represent the document, as
shown in Figure 2A.
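
A minimal Python sketch of the weighted sum and the resulting ranking follows. The data layout (one list per feature, keyed by a short name), the function names, and the assignment of the example weight values to particular features are assumptions made for illustration; the optional `weight` callable models position-dependent weights.

```python
from typing import Callable, Dict, List, Optional, Sequence

# Feature order: text, line art, picture, blackness, saturation.
FEATURES = ("t", "f", "p", "b", "c")
# Uniform example weights; the mapping of values to features is assumed.
DEFAULT_WEIGHTS = {"t": 0.1, "f": 0.4, "p": 1.0, "b": 0.8, "c": 2.0}

Page = Dict[str, Sequence[float]]


def saliency(page: Page, weight: Optional[Callable[[str, int], float]] = None) -> float:
    """Weighted sum over all features and grid cells.

    `weight(name, i)` returns the weight of feature `name` at cell `i`;
    when omitted, the uniform DEFAULT_WEIGHTS are used for every cell.
    """
    if weight is None:
        weight = lambda name, i: DEFAULT_WEIGHTS[name]
    return sum(weight(name, i) * value
               for name in FEATURES
               for i, value in enumerate(page[name]))


def rank_pages(pages: List[Page], top_k: int) -> List[int]:
    """Indices of the top_k most salient pages, most salient first."""
    order = sorted(range(len(pages)), key=lambda idx: saliency(pages[idx]), reverse=True)
    return order[:top_k]


# Three hypothetical one-cell pages: plain text, a color picture, line art.
text_page = {"t": [1.0], "f": [0.0], "p": [0.0], "b": [0.2], "c": [0.0]}
picture_page = {"t": [0.0], "f": [0.0], "p": [1.0], "b": [0.5], "c": [0.5]}
line_art_page = {"t": [0.0], "f": [1.0], "p": [0.0], "b": [0.1], "c": [0.0]}

top_two = rank_pages([text_page, picture_page, line_art_page], top_k=2)
```

With these sample values the color picture page ranks first and the line-art page second, ahead of the plain text page.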



Relative Saliency 

Since one purpose of using visually salient icons is to aid the retrieval of
documents, in one embodiment, the icon selection criterion considers common
characteristics of other documents in the collection of documents. For example,
the significance of a page containing a red picture in one corner is diminished if
all pages in the database have the same characteristic. This situation is quite
possible in special collections where all documents contain the same logo or
other types of marking. This problem is known in information retrieval and is
typically dealt with by incorporating a database norm into the equation. By
using a centroid subtraction method, similar types of correction mechanisms
may be applied to the techniques described herein.

Given a collection of documents, the centroid is the average visual feature
vector of all pages. To discount properties common to all documents in the
database, the centroid is subtracted from individual feature vectors before
saliency calculation. In other words,

$$\mathrm{RelSaliency}(\vec{v}) = \mathrm{Saliency}(|\vec{v} - \vec{u}|)$$

where $\vec{u}$ is the centroid vector. Thus, in one embodiment, saliency is
evaluated based on features that are "out-of-normal" in the database. Using the
example presented above, if all pages in the database contain a red picture at
grid position $i$, then the average value of $c_i$ will be fairly high. Therefore,
a page that does not have a red picture in the corner should be more noticeable.
In this case, if $c_i = 0$ in this page, a high value will result after subtracting
the average $c_i$ in the centroid. In this example, since we are ignoring hue, a
page that has a picture in that position, regardless of color, will have a high
$c_i$ value. In contrast, a page that does not have any color in that position
will stand out.
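
Centroid subtraction can be sketched in Python as below. This is a hedged illustration: feature vectors are flat lists with one weight per element, and the absolute value of the difference is taken before weighting, which is an assumption chosen so that a page lacking a common feature scores high, as in the example above. The function names are hypothetical.

```python
from typing import List, Sequence


def centroid(vectors: List[Sequence[float]]) -> List[float]:
    """Element-wise average of all page feature vectors in the collection."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def rel_saliency(v: Sequence[float], u: Sequence[float],
                 weights: Sequence[float]) -> float:
    """Weighted sum of |v - u|: saliency relative to the collection norm.

    Taking the absolute difference is an assumption of this sketch.
    """
    return sum(w * abs(a - b) for w, a, b in zip(weights, v, u))


# One saturation cell per page: two pages carry the common color mark,
# the third page has no color in that position.
collection = [[0.9], [0.9], [0.0]]
weights = [2.0]  # assumed weight for the saturation feature

u = centroid(collection)
scores = [rel_saliency(v, u, weights) for v in collection]
most_unusual = scores.index(max(scores))
```

In this sample collection the page without the common color mark receives the highest relative score, while the two pages that share the mark are discounted.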



Visual Distance 

To measure the visual difference between two pages, the Saliency function
may be applied to the absolute values of the differences between corresponding
features:

$$\mathrm{VisualDist}(\vec{v}_1, \vec{v}_2) = \mathrm{Saliency}(|\vec{v}_1 - \vec{v}_2|)$$

VisualDist takes a grid-by-grid accounting of the discrepancies in texture
and color between at least two pages and then assesses the visual saliency of the
total difference. The portion $|\vec{v}_1 - \vec{v}_2|$ generates a vector whose
elements are all between 0 and 1. While the norm is most frequently used (or
misused) to measure the distance between two vectors regardless of whether a
uniform numeric scale applies to all components, this measure appears to be more
suitable for describing what a visual difference is and how visually significant
that difference may be.
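
A small Python sketch of the distance follows, assuming flat feature vectors and one weight per element; the function name `visual_dist` and the sample values are illustrative only.

```python
from typing import Sequence


def visual_dist(v1: Sequence[float], v2: Sequence[float],
                weights: Sequence[float]) -> float:
    """Weighted sum of the absolute element-wise differences of two pages."""
    if not (len(v1) == len(v2) == len(weights)):
        raise ValueError("vectors and weights must have the same length")
    return sum(w * abs(a - b) for w, a, b in zip(weights, v1, v2))


# Two hypothetical pages with one text cell and one saturation cell each.
page_a = [1.0, 0.0]
page_b = [0.0, 0.5]
weights = [0.1, 2.0]  # assumed weights: text, saturation

d_ab = visual_dist(page_a, page_b, weights)
d_ba = visual_dist(page_b, page_a, weights)
d_aa = visual_dist(page_a, page_a, weights)
```

Because each term is a weighted absolute difference, the half-cell change in color contributes far more to the distance than the full-cell change in text coverage, which is the effect the weighting is meant to capture.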

One application of the visual distance is to produce a condensed
representation of a multi-page document, as shown in Figure 3. The visual
difference between every two consecutive pages determines the amount of
overlapping that exists; therefore, only significantly different-looking pages are
shown in full. Since VisualDist is a distance metric, it can be used to cluster all
pages in a document, or pages in a collection of documents. Pages are first
grouped by their visual similarities. Thereafter, an exemplar page for each
cluster is selected by picking the page whose feature vector is closest to the
cluster center. This produces the exemplar pages of a document as seen in
Figure 2B.
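
Exemplar selection can be sketched as follows, assuming the pages have already been grouped by some clustering method (the grouping step itself is not shown and is not specified here). The cluster center is taken to be the element-wise mean, and the distance used in the demo is a plain sum of absolute differences; both choices, and all names, are assumptions of this example.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cluster_center(vectors: List[Vector]) -> List[float]:
    """Element-wise mean of the vectors in one cluster."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def pick_exemplars(
    pages: List[Vector],
    clusters: List[List[int]],
    distance: Callable[[Vector, Vector], float],
) -> List[int]:
    """For each cluster of page indices, return the page nearest its center."""
    exemplars = []
    for members in clusters:
        center = cluster_center([pages[i] for i in members])
        exemplars.append(min(members, key=lambda i: distance(pages[i], center)))
    return exemplars


def l1_distance(a: Vector, b: Vector) -> float:
    """Placeholder distance assumed for this demo only."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Six hypothetical one-feature pages, already grouped into two clusters.
pages = [[0.0], [0.2], [0.1], [0.8], [0.9], [1.0]]
clusters = [[0, 1, 2], [3, 4, 5]]
exemplars = pick_exemplars(pages, clusters, l1_distance)
```

In the sample data, page 2 sits at the center of the first group and page 4 at the center of the second, so those two pages would serve as the exemplar icons.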

Icon Display 

It will be appreciated that although Figures 2A, 2B, and 3 illustrate examples of
displays of icons created using the techniques of the invention, other
arrangements are possible. The scheme may adapt to the amount of space
available by picking out a smaller or a larger number of page icons. The amount
of space can be specified by an external constraint (e.g., physical display size), by
a system designer, or by a user (e.g., if the icon is displayed within a resizable
box). The amount of space can also be a variable. For example, the number of
page icons that are shown may depend on the number of clusters found within
the document, the length of the document, the number of pages whose visual
saliency is above some predetermined threshold, or the connection bandwidth.
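
One way to combine a saliency threshold with a space limit is sketched below. The function name `choose_icons`, the tie between "space available" and a simple icon count, and the decision to show the chosen icons in document order are all assumptions of this example.

```python
from typing import List, Sequence


def choose_icons(scores: Sequence[float], threshold: float, max_icons: int) -> List[int]:
    """Pick at most max_icons pages whose saliency exceeds the threshold.

    The most salient candidates win when space is short; the result is
    returned in document order for display.
    """
    candidates = [i for i, score in enumerate(scores) if score > threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)
    return sorted(candidates[:max_icons])


# Hypothetical saliency scores for a five-page document.
scores = [0.3, 2.4, 0.5, 1.8, 0.2]
small_box = choose_icons(scores, threshold=0.4, max_icons=2)
large_box = choose_icons(scores, threshold=0.4, max_icons=5)
```

With room for only two icons, pages 1 and 3 are shown; a larger box admits page 2 as well, while pages below the threshold are never shown.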

The scheme of icons may adapt to the shape of the space available. Figure 
3 shows a linear display. The same information may be shown as a sequence of 
lines of page icons, for a square or rectangular shape, or as stacks of 
distinct-looking pages. Alternatively, the icons may be arranged around a circle
or oval. In general, an ordered set of icons may follow any arbitrary path. 

The generated icons are suitable for use in a graphical user interface, 
where they can be generated on-the-fly, for printed use, where they are 
generated ahead of time, or for use on the Web or in multimedia presentation 
formats. 

Figure 4 illustrates one embodiment of a computer system 10 which
implements the principles of the present invention. Computer system 10
comprises a processor 17, a memory 18, and interconnect 15 such as a bus or a
point-to-point link. Processor 17 is coupled to memory 18 by interconnect 15. In
addition, a number of user input/output devices, such as a keyboard 20 and
display 25, are coupled to a chipset (not shown) which is then connected to
processor 17. The chipset (not shown) is typically connected to processor 17
using an interconnect that is separate from interconnect 15.

Processor 17 represents a central processing unit of any type of
architecture (e.g., the Intel architecture, Hewlett Packard architecture, Sun
Microsystems architecture, IBM architecture, etc.), or hybrid architecture. In
addition, processor 17 could be implemented on one or more chips. Memory 18
represents one or more mechanisms for storing data such as the number of times
the code is checked and the results of checking the code. Memory 18 may
include read only memory ("ROM"), random access memory ("RAM"), magnetic
disk storage mediums, optical storage mediums, flash memory devices, and/or
other machine-readable mediums. In one embodiment, interconnect 15
represents one or more buses (e.g., accelerated graphics port bus, peripheral
component interconnect bus, industry standard architecture bus, X-Bus, video
electronics standards association related to buses, etc.) and bridges (also termed
bus controllers).

While this embodiment is described in relation to a single processor
computer system, the invention could be implemented in a multi-processor
computer system or environment. In addition to other devices, one or more
networks 30 may be present. Network 30 represents one or more network
connections for transmitting data over a machine readable medium. The invention
could also be implemented on multiple computers connected via such a network.

Figure 4 also illustrates that memory 18 has stored therein data 35 and
program instructions (e.g., software, computer program, etc.) 36. Data 35
represents data stored in one or more of the formats described herein. Program
instructions 36 represent the necessary code for performing any and/or all of the
techniques described with reference to Figures 2A, 2B and 3. Program
instructions may be stored in a computer readable storage medium, such as any
type of disk including floppy disks, optical disks, CD-ROMs, and
magnetic-optical disks, ROMs, RAMs, erasable programmable read only memories
("EPROMs"), electrically erasable programmable memories ("EEPROMs"),
magnetic or optical cards, or any type of media suitable for storing electronic
instructions, and each coupled to a computer system bus. It will be recognized
by one of ordinary skill in the art that memory 18 preferably contains additional
software (not shown), which is not necessary to understanding the invention.

Figure 4 additionally illustrates that processor 17 includes decoder 40.
Decoder 40 is used for decoding instructions received by processor 17 into
control signals and/or microcode entry points. In response to these control
signals and/or microcode entry points, decoder 40 performs the appropriate
operations.

In the preceding detailed description, the invention is described with
reference to specific embodiments thereof. It will, however, be evident that
various modifications and changes may be made thereto without departing from
the broader spirit and scope of the invention as set forth in the claims. The
specification and drawings are, accordingly, to be regarded in an illustrative
rather than a restrictive sense.

CLAIMS

What is claimed is:

1. A method comprising:
    extracting at least one visual feature from a document, the document
having a plurality of pages;
    ranking the pages in a document based on the at least one visual feature;
and
    display pages based on ranking.

2. The method of claim 1 wherein a plurality of visual features are
used in ranking and at least one visual feature is weighted.

3. The method of claim 1 wherein the visual feature is one of a
picture, a text block, a character size, a character style, and a color.

4. The method of claim 1 wherein a visual feature is weighted based
on gradations of the visual feature.

5. The method of claim 1 wherein a visual feature is weighted based
on location of the visual feature on a page.
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1 6. The method of claim 1 wherein the visual feature is represented in 

2 a vector form. 

1 7. The method of claim 1 wherein the visual feature is used as a 

2 distance measure between a first document and a second document. 

1 8. The method of claim 1, further comprising clustering of a plurality 

2 of pages within a document. 

,Q 1 9. The method of claim 1, further comprising using visual features to 

n 2 reveal a transition from a first page to a second page of a document. 

1 10. The method of claim 1, wherein ranking of the pages includes a 

is;;? 

m 2 correction mechanism. 

1 11, The method of claim 1, wherein a scheme showing one of a 

2 plurality of pages in a document and a plurality of documents is by one of a 

3 linear display, a line of icons, and as a stack. 

12. A computer system comprising:
    a display;
    a processor coupled to the display; and
    a memory coupled to the processor and having stored therein a routine,
    which when executed by the processor, causes the processor to generate
    display data through:
        extracting at least one visual feature from a document, the
        document having a plurality of pages,
        ranking the pages in the document according to the at least one
        visual feature,
        selecting a page for representing a document according to a rank,
        and
        displaying the selected page as the display data.

13. The computer system of claim 12 wherein a plurality of visual
features are used in ranking and at least one visual feature is weighted.

14. The computer system of claim 13 wherein the visual feature is one
of a picture, a text block, a character size, a character style, and a color.

15. The computer system of claim 13 wherein a visual feature is
weighted based upon gradations of the visual feature.

16. The computer system of claim 13 wherein the visual feature is
represented in a vector form.

17. The computer system of claim 13 wherein the visual feature is used
as a distance measure between a first document and a second document.

18. The computer system of claim 12, wherein generating display data
further comprises clustering of a plurality of pages within a document.

19. The computer system of claim 12, wherein a plurality of pages are
selected and generating display data further comprises using visual features to
reveal a transition from a first page to a second page of a document.

21. The computer system of claim 12, wherein ranking the pages
includes a correction mechanism.

22. The computer system of claim 12, wherein a scheme showing one
of a plurality of pages in a document and a plurality of documents is by one of a
linear display, a line of icons, and as a stack.



23. An article of manufacture having at least one machine readable
storage media containing executable program instructions which when executed
by a digital processing system cause the digital processing system to:
    extract at least one visual feature from a document, the document having
    a plurality of pages,
    rank pages in the document based on said at least one visual feature,
    select the pages for representing a document based on ranking, and
    display selected pages.

24. The machine readable storage media of claim 23, wherein a
plurality of visual features are used in ranking and at least one visual feature is
weighted.

25. The machine readable storage media of claim 23, wherein the visual
feature is one of a picture, a text block, a character size, a character style, and a
color.

26. The machine readable storage media of claim 25, wherein a visual
feature is weighted based upon gradations of the visual feature.

27. The machine readable storage media of claim 23, wherein the visual
feature is represented in a vector form.

28. The machine readable storage media of claim 23, wherein the visual
feature is used as a distance measure between a first document and a second
document.

29. The machine readable storage media of claim 23, further
comprising clustering of a plurality of pages within a document.

30. The machine readable storage media of claim 23, further
comprising using visual features to reveal a transition from a first page to a
second page of a document.

32. The machine readable storage media of claim 23, wherein ranking
the pages includes a correction mechanism.

33. The machine readable storage media of claim 23, wherein a scheme
showing one of a plurality of pages in a document and a plurality of documents
is by one of a linear display, a line of icons, and as a stack.



34. A method comprising:
    extracting at least one visual feature from a document, the document
    having a plurality of pages;
    grouping pages in a document using a plurality of visual features;
    selecting representative pages from groups; and
    representing the pages.
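
By way of illustration only, and without limiting the claims, the grouping of pages and the selection of representative pages recited above might be sketched in Python as follows. Everything in the sketch is an assumption made for the example: the two-component feature vectors, the Euclidean distance measure, the greedy threshold grouping, the function names, and the threshold value are hypothetical and are not taken from the specification.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def group_pages(features, threshold):
    """Greedy grouping (an assumed, illustrative scheme): a page joins the
    first group whose centroid lies within `threshold`; otherwise it starts
    a new group. Returns a list of groups, each a list of page indices."""
    groups = []
    for idx, vec in enumerate(features):
        for group in groups:
            if distance(vec, centroid([features[i] for i in group])) <= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups

def representatives(features, groups):
    """For each group, pick the page closest to that group's centroid."""
    reps = []
    for group in groups:
        c = centroid([features[i] for i in group])
        reps.append(min(group, key=lambda i: distance(features[i], c)))
    return reps

# Hypothetical per-page features: [picture area fraction, text area fraction]
pages = [
    [0.9, 0.1],
    [0.7, 0.1],
    [0.8, 0.1],
    [0.1, 0.9],
    [0.0, 1.0],
    [0.2, 0.8],
]
groups = group_pages(pages, threshold=0.3)
reps = representatives(pages, groups)
print(groups, reps)
```

With these hypothetical values the picture-heavy pages and the text-heavy pages fall into two separate groups, and one page from each group is chosen to stand for it. Other distance measures or clustering schemes could be substituted without changing the overall flow.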

ABSTRACT OF THE DISCLOSURE
A system and a method for visually summarizing a document are disclosed. The
system comprises a display, a processor coupled to the display, and a memory
coupled to the processor. Stored in the memory is a routine, which when executed
by the processor, causes the processor to generate display data. The routine
causes the processor to generate data through extracting at least one visual
feature from a document having a plurality of pages, ranking the pages in a
document, selecting a page for representing a document according to the visual
feature, and displaying the selected page as display data.
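
As a minimal sketch of the routine summarized above, the following Python example scores each page by a weighted sum of its visual features, ranks the pages by score, and selects the top-ranked page to represent the document. The feature names, the feature values, the weights, and the weighted-sum scoring rule are all assumptions chosen for illustration; the disclosure does not prescribe them.

```python
def score_page(features, weights):
    """Weighted sum of a page's visual feature values.
    Features without a weight contribute nothing."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def rank_pages(pages, weights):
    """Return page indices ordered from highest to lowest score."""
    return sorted(
        range(len(pages)),
        key=lambda i: score_page(pages[i], weights),
        reverse=True,
    )

def select_representative(pages, weights):
    """Pick the top-ranked page to stand for the whole document."""
    return rank_pages(pages, weights)[0]

# Hypothetical feature values per page (fractions in the range 0..1).
document = [
    {"picture": 0.0, "text_block": 0.9, "color": 0.1},  # page 0: mostly text
    {"picture": 0.7, "text_block": 0.2, "color": 0.8},  # page 1: large color picture
    {"picture": 0.3, "text_block": 0.6, "color": 0.2},  # page 2: mixed content
]

# Hypothetical weights: pictures count most, plain text blocks least.
weights = {"picture": 2.0, "text_block": 0.5, "color": 1.0}

ranking = rank_pages(document, weights)
best = select_representative(document, weights)
print(ranking, best)
```

Under these assumed weights the page with the large color picture ranks first and would be the one displayed as the document's icon; changing the weights changes which page is selected.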
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first, and joint inventor (if plural names are listed below) of the subject matter which is claimed and 
for which a patent is sought on the invention entitled

"A METHOD AND AN APPARATUS FOR VISUAL SUMMARIZATION OF DOCUMENTS"

the specification of which 

X is attached hereto. 

was filed on as 

United States Application Number 

or PCT International Application Number 

and was amended on . 

(if applicable) 

I hereby state that I have reviewed and understand the contents of the above-identified 
specification, including the claim(s), as amended by any amendment referred to above. I do not 
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America before my invention thereof, or patented or described in any printed publication in any 
country before my invention thereof or more than one year prior to this application, that the same 
was not in public use or on sale in the United States of America more than one year prior to this 
application, and that the invention has not been patented or made the subject of an inventor's 
certificate issued before the date of this application in any country foreign to the United States of 
America on an application filed by me or my legal representatives or assigns more than twelve 
months (for a utility patent application) or six months (for a design patent application) prior to this 
application. 

I acknowledge the duty to disclose all information known to me to be material to patentability as 
defined in Title 37, Code of Federal Regulations, Section 1.56.

I hereby claim foreign priority benefits under Title 35, United States Code, Section 119(a)-(d), of any 
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known to me to be material to patentability as defined in Title 37, Code of Federal Regulations, 
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(Application Number) Filing Date (Status - patented, 
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statements were made with the knowledge that willful false statements and the like so made 
are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United 
States Code and that such willful false statements may jeopardize the validity of the 
application or any patent issued thereon. 
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Full Name of Third/Joint Inventor . 



Inventor's Signature Date . 

Residence Citizenship . 



(City, State) (Country) 
Post Office Address 
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Reg. No. 25,128; Judith A. Szepesi, Reg. No. 39,393; Vincent P. Tassinari, Reg. No. 42,179; Edwin H. 
Taylor, Reg. No. 25,129; John F. Travis, Reg. No. 43,203; George G. C. Tseng, Reg. No. 41,355; Joseph 
APPENDIX B



Title 37, Code of Federal Regulations, Section 1.56
Duty to Disclose Information Material to Patentability 

(a) A patent by its very nature is affected with a public interest. The public interest is best served,
and the most effective patent examination occurs when, at the time an application is being examined, the
Office is aware of and evaluates the teachings of all information material to patentability. Each individual
associated with the filing and prosecution of a patent application has a duty of candor and good faith in
dealing with the Office, which includes a duty to disclose to the Office all information known to that individual
to be material to patentability as defined in this section. The duty to disclose information exists with respect
to each pending claim until the claim is cancelled or withdrawn from consideration, or the application becomes
abandoned. Information material to the patentability of a claim that is cancelled or withdrawn from
consideration need not be submitted if the information is not material to the patentability of any claim
remaining under consideration in the application. There is no duty to submit information which is not material
to the patentability of any existing claim. The duty to disclose all information known to be material to
patentability is deemed to be satisfied if all information known to be material to patentability of any claim
issued in a patent was cited by the Office or submitted to the Office in the manner prescribed by §§ 1.97(b)-(d)
and 1.98. However, no patent will be granted on an application in connection with which fraud on the Office
was practiced or attempted or the duty of disclosure was violated through bad faith or intentional misconduct.
The Office encourages applicants to carefully examine:

(1) Prior art cited in search reports of a foreign patent office in a counterpart application, and

(2) The closest information over which individuals associated with the filing or prosecution of a 
patent application believe any pending claim patentably defines, to make sure that any material information 
contained therein is disclosed to the Office. 

(b) Under this section, information is material to patentability when it is not cumulative to 
information already of record or being made of record in the application, and

(1) It establishes, by itself or in combination with other information, a prima facie case of 
unpatentability of a claim; or 

(2) It refutes, or is inconsistent with, a position the applicant takes in: 

(i) Opposing an argument of unpatentability relied on by the Office, or 

(ii) Asserting an argument of patentability. 

A prima facie case of unpatentability is established when the information compels a conclusion that a claim is 
unpatentable under the preponderance of evidence, burden-of-proof standard, giving each term in the claim
its broadest reasonable construction consistent with the specification, and before any consideration is given to 
evidence which may be submitted in an attempt to establish a contrary conclusion of patentability. 

(c) Individuals associated with the filing or prosecution of a patent application within the 
meaning of this section are: 

(1) Each inventor named in the application;

(2) Each attorney or agent who prepares or prosecutes the application; and 

(3) Every other person who is substantively involved in the preparation or prosecution of the 
application and who is associated with the inventor, with the assignee or with anyone to whom there is an 
obligation to assign the application. 

(d) Individuals other than the attorney, agent or inventor may comply with this section by 
disclosing information to the attorney, agent, or inventor. 



Rev. 02/07/00 (D1)